OC-782K: Knowledge Graph of "Scientometrics" modelled according to the OpenCitations Data Model
Creators
- 1. FIZ-Karlsruhe
- 2. University of Bologna
Description
This dataset is a knowledge graph (KG) extracted from a triplestore that covers the journal Scientometrics and is modelled according to the OpenCitations Data Model. The original triplestore is available here. The KG was extracted for a research project on knowledge graph embeddings (KGEs) for author name disambiguation (AND). The structural triples of the knowledge graph are split into training, validation and test sets so that representation learning methods can be applied directly. Textual literals and numeric literals are stored separately from the structural triples to support multimodal approaches to KGEs (see arXiv:1802.00934). For the same reason, the textual literals are already encoded as sentence embeddings in the file textual_literals.npy, and the numeric literals are already stored as a numeric matrix in the file numeric_literals.npy. The file and_eval.json contains the dataset used to evaluate our AND architecture. The script used to gather this dataset is available in the GitHub repository: https://github.com/sntcristian/and-kge/tree/main/open-citations.
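Because the description names the files but not their location inside the archive, a first step after downloading could be to locate them. The following is a minimal sketch using only the Python standard library; the helper names `locate_members` and `load_json_member` are hypothetical, and the assumption that and_eval.json is a regular JSON file somewhere inside OC-782K.zip is ours, not something the record states.

```python
import json
import zipfile
from pathlib import PurePosixPath

# File names taken from the description; where they sit inside the
# archive is NOT documented, so we search by base name (an assumption).
EXPECTED = ("textual_literals.npy", "numeric_literals.npy", "and_eval.json")


def locate_members(zip_path, names=EXPECTED):
    """Map each expected file name to its path inside the archive, or None."""
    found = {name: None for name in names}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            base = PurePosixPath(member).name
            if base in found and found[base] is None:
                found[base] = member
    return found


def load_json_member(zip_path, member):
    """Parse a JSON member directly from the archive without extracting it."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(member) as handle:
            return json.load(handle)
```

Calling `locate_members("OC-782K.zip")` returns a dictionary whose values are archive paths, or `None` for any file that was not found. The two .npy files would typically be read with `numpy.load` once extracted; their shapes and row ordering are not described here and should be checked against the linked repository.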
Files
Name | Size | MD5 |
---|---|---|
OC-782K.zip | 230.7 MB | fe4df744ee5f00f97670fb1893ba5466 |
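The MD5 checksum listed for OC-782K.zip can be used to check that a download is complete. The sketch below shows one way to do this with Python's `hashlib`; the function names `md5_of_file` and `verify_download` are illustrative, and the local path passed in is whatever location the archive was saved to.

```python
import hashlib

# Checksum copied from the file listing for OC-782K.zip.
EXPECTED_MD5 = "fe4df744ee5f00f97670fb1893ba5466"


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected=EXPECTED_MD5):
    """Return True when the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()
```

Reading in chunks keeps memory use flat for an archive of this size. A `False` result from `verify_download` suggests a truncated or corrupted download rather than a problem with the dataset itself.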